Abstract

The simulation result of Nunes, Kuan, and Newbold suggests that it is possible to
estimate a spurious break for a regression model with I(1) disturbances. In this note,
we provide a rigorous proof of this phenomenon. We also show that their finding
applies to integrated regressors, so that a spurious regression may lead to a spurious
break. However, if two integrated processes are cointegrated with a structural change
in the cointegrating relationship, the break point can be consistently estimated. The
consistency is in terms of the integer index rather than in terms of the sample fraction.
This rapid rate of convergence is not attainable for stationary or, more generally, for
I(0) regressors. Furthermore, the consistency holds even when the magnitudes of the
breaks are small, provided they do not converge to zero too fast. These consistency
results are also obtained for a broken trend model.


Key words and phrases. Spurious break, spurious regression, change point,
cointegration, broken trend.
Running head: Spurious Break


1      Introduction 

Recently, Nunes, Kuan, and Newbold (1995) pointed out that when the disturbances
of a regression model follow an I(1) process there is a tendency to estimate a break
point in the middle of the sample, even though a break point does not exist. This
phenomenon is called a "spurious break" by the authors and was discovered by a
simulation experiment. In this note, we provide a rigorous proof of this phenomenon.
Furthermore, we show that a spurious break occurs for I(1) regressors as well, so that
a spurious regression may lead to a spurious break.

We then consider the problem in which the dependent variable and the I(1)
regressors are cointegrated but the cointegrating relationship undergoes a shift. This is
a more general notion of cointegration because the cointegrating vector is not
time-invariant. We ask whether the break point can be consistently estimated.
It is shown that the estimated break point converges quickly to the true break point.
The rate of convergence is faster than the corresponding result for I(0) regressors,
given the same magnitude of shift.

A structural change in a cointegrating relationship can be a useful model in
empirical applications. Cointegration describes a long-run equilibrium condition. An
equilibrium may be disturbed by policy regime changes, resulting in a new
equilibrium, so that a different cointegrating vector may be needed to characterize this new
equilibrium. A special case is a shift in the mean level of the long-run equilibrium,
which can be expressed as a shift in the intercept of a cointegrating-regression model.
This shift is exhibited graphically as a change in the "gap" between two cointegrated
series.

Although not a concern of this paper, we point out that testing for cointegration
while allowing for a structural change has been studied by a number of authors; see, e.g.,
Hansen (1992), Quintos and Phillips (1993), Gregory and Hansen (1996), and Campos,
Ericsson, and Hendry (1996). One implication of a structural change in a cointegrating
relationship is that one may not be able to reject the null of no cointegration if
conventional tests are used, even though a long-run relationship between two series
does, in fact, exist. This situation calls for use of the test statistics proposed by the
aforementioned authors.

2      Spurious  Break 

Consider the model:

$$
y_t = \begin{cases} x_t'\beta_1 + \varepsilon_t, & t = 1, 2, \ldots, k \\ x_t'\beta_2 + \varepsilon_t, & t = k+1, \ldots, T. \end{cases}
$$

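Let $\hat\beta_1(k)$ be the least squares estimator of $\beta_1$ based on the first $k$ observations, and
$\hat\beta_2(k)$ be the least squares estimator of $\beta_2$ based on the last $T - k$ observations, i.e.,

$$
\hat\beta_1(k) = \Big(\sum_{t=1}^{k} x_t x_t'\Big)^{-1}\Big(\sum_{t=1}^{k} x_t y_t\Big),
\qquad
\hat\beta_2(k) = \Big(\sum_{t=k+1}^{T} x_t x_t'\Big)^{-1}\Big(\sum_{t=k+1}^{T} x_t y_t\Big).
$$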

Define the sum of squared residuals for the full sample as

$$
S_T(k) = \sum_{t=1}^{k}\big(y_t - x_t'\hat\beta_1(k)\big)^2 + \sum_{t=k+1}^{T}\big(y_t - x_t'\hat\beta_2(k)\big)^2
$$

and define the break point estimator as

$$
\hat k = \arg\min_{1 \le k \le T} S_T(k).
$$

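Finally, let

$$
\hat\lambda_T = \min\{\lambda : \lambda = \arg\min_{u \in [\underline\lambda, \bar\lambda]} S_T([Tu])\}
$$

where $0 < \underline\lambda < \bar\lambda < 1$. The behavior of $\hat\lambda_T$ is considered for two cases: I(0) and I(1)
error processes.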
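The definitions of $S_T(k)$ and the break point estimator can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: the function names `ssr` and `break_point`, the trimming fractions 0.15 and 0.85 (standing in for $\underline\lambda$ and $\bar\lambda$), and the toy data with a mean shift are hypothetical choices, not taken from the paper.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def break_point(X, y, lam_lo=0.15, lam_hi=0.85):
    """Minimize S_T(k) over k in [lam_lo*T, lam_hi*T]; return (k_hat, S_T(k_hat))."""
    T = len(y)
    ks = range(int(np.floor(lam_lo * T)), int(np.floor(lam_hi * T)) + 1)
    # S_T(k): residual sum of squares from separate fits on 1..k and k+1..T.
    S = {k: ssr(X[:k], y[:k]) + ssr(X[k:], y[k:]) for k in ks}
    k_hat = min(S, key=S.get)  # the first minimizer is returned if there are ties
    return k_hat, S[k_hat]

# Toy data (assumed for illustration): a mean shift after observation 60.
rng = np.random.default_rng(0)
T, k0 = 100, 60
X = np.ones((T, 1))
y = np.where(np.arange(1, T + 1) <= k0, 0.0, 5.0) + 0.1 * rng.standard_normal(T)
k_hat, s_min = break_point(X, y)
print(k_hat, k_hat / T)
```

Returning the first minimizer mirrors the $\min\{\cdot\}$ convention in the definition of $\hat\lambda_T$; the brute-force loop over $k$ is chosen for clarity rather than speed.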

For an I(0) error process, let $Q(\lambda)$ and $R(\lambda)$ be defined as in [A1] and [A3] of Nunes,
Kuan, and Newbold (1995, hereafter NKN), respectively. More specifically, $Q(\lambda)$ is
the limit of $D_T^{-1/2}\big(\sum_{t=1}^{[T\lambda]} x_t x_t'\big)D_T^{-1/2}$ for an appropriate scaling matrix $D_T$, and $R(\lambda)$ is
the limit of $D_T^{-1/2}\sum_{t=1}^{[T\lambda]} x_t\varepsilon_t$. The matrix $Q(\lambda)$ is assumed to be positive-definite and
strictly increasing. The process $R(\lambda)$ is Gaussian. When $x_t$ is stationary, $Q(\lambda) = \lambda Q$
for some $Q > 0$ and $R(\lambda)$ is a Brownian motion.


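In this section, we assume there is no break, i.e. $\beta_1 = \beta_2$. NKN show that (see
their Theorem 3.1b)

$$
\hat\lambda_T \xrightarrow{d} \arg\max_{\lambda \in [\underline\lambda, \bar\lambda]} M(\lambda) \tag{1}
$$

where $M(\lambda)$ is a stochastic process given by

$$
M(\lambda) = R(\lambda)'Q(\lambda)^{-1}R(\lambda) + [R(1) - R(\lambda)]'[Q(1) - Q(\lambda)]^{-1}[R(1) - R(\lambda)]. \tag{2}
$$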

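For an I(1) error process $\varepsilon_t$, we assume that $T^{-2}\sum_{t=1}^{[T\lambda]}\varepsilon_t^2 \Rightarrow c\int_0^\lambda W^2(u)\,du$ with
$W(u)$ being a standard Wiener process, and $c > 0$ a constant. Further assume (see
[A3'] in NKN$^1$), for some $a > 0$,

$$
T^{-a/2}D_T^{-1/2}\sum_{t=1}^{[T\lambda]} x_t\varepsilon_t \Rightarrow G(\lambda) \tag{3}
$$

where $G(\lambda)$ is a Gaussian process. NKN prove that

$$
\hat\lambda_T \xrightarrow{d} \arg\max_{\lambda \in [\underline\lambda, \bar\lambda]} M^*(\lambda) \tag{4}
$$

where

$$
M^*(\lambda) = G(\lambda)'Q(\lambda)^{-1}G(\lambda) + [G(1) - G(\lambda)]'[Q(1) - Q(\lambda)]^{-1}[G(1) - G(\lambda)]. \tag{5}
$$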

Examining (2) and (5) we find that, whether the error process is I(0) or I(1), the
results are essentially the same. Namely, in the absence of a break, the estimated
break point $\hat\lambda_T$ is a random variable with support in $[\underline\lambda, \bar\lambda]$. Not much further can be
said on the compact interval $[\underline\lambda, \bar\lambda]$. However, for an I(0) error process $\varepsilon_t$, NKN further
prove that $M(\lambda) \to \infty$ as $\lambda \to 0$ or 1, thus $\hat\lambda_T \to \{0, 1\}$ if $\underline\lambda \to 0$ and $\bar\lambda \to 1$; also see
Andrews (1993). In their Remark 1 (p. 742), NKN pointed out that they were unable
to characterize the limiting behavior of $M^*(\lambda)$ for $\lambda$ near 0 or 1. Through simulation,
they find that $M^*(\lambda)$ behaves differently from $M(\lambda)$. More specifically, $M^*(\lambda)$ does
not diverge to infinity as $\lambda$ decreases to zero or increases to 1.


$^1$ Their original assumption is stated in terms of $y_t$ rather than $\varepsilon_t$, which applies to $y_t$ being I(1).
The current form allows $y_t$ to depend on deterministic regressors as well as on an additive I(1) error
process.


In the following, we shall prove that $M^*(\lambda)$ is a well defined process on $[0, 1]$ and is
uniformly bounded in probability over $[0, 1]$. Note that $M^*(\lambda)$ is the limiting process
of $T^{-a}M_T^*([T\lambda])$, where $a$ is defined in (3) and

$$
M_T^*(k) = \Big(\sum_{t=1}^{k}\varepsilon_t x_t\Big)'\Big(\sum_{t=1}^{k} x_t x_t'\Big)^{-1}\Big(\sum_{t=1}^{k} x_t\varepsilon_t\Big) + \Big(\sum_{t=k+1}^{T}\varepsilon_t x_t\Big)'\Big(\sum_{t=k+1}^{T} x_t x_t'\Big)^{-1}\Big(\sum_{t=k+1}^{T} x_t\varepsilon_t\Big). \tag{6}
$$

We shall assume that $a \ge 2$ because this is true when $x_t$ contains a nonzero mean
regressor (e.g., a constant, or a trend). When $x_t$ is an I(0) process with zero mean, it
is possible that $a = 1$. This case is not considered in this paper.

Theorem 1 For the $a$ defined in (3), assume $a \ge 2$. We have

$$
\sup_{\lambda \in (0,1)} M^*(\lambda) = O_p(1). \tag{7}
$$

Proof of Theorem 1. For an arbitrary vector $z$ and an arbitrary projection matrix
$P$, we have $z'Pz \le z'z$. Apply this inequality to $M_T^*(k)$ to obtain

$$
M_T^*(k) \le \sum_{t=1}^{T}\varepsilon_t^2 \quad \text{for all } k \in [1, T]. \tag{8}
$$

Since $a \ge 2$, we have $T^{-a}M_T^*(k) \le T^{-2}\sum_{t=1}^{T}\varepsilon_t^2$ for all $k \in [1, T]$. Moreover, because
$T^{-2}\sum_{t=1}^{T}\varepsilon_t^2$ has a limit, $T^{-a}M_T^*(k)$ is uniformly bounded in probability. Thus its limit,
$M^*(\lambda)$, is uniformly bounded in probability for $\lambda \in (0, 1)$. □
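The projection bound (8) can be checked numerically. The following sketch is an illustration only: the helper name `M_T`, the sample size, and the choice of a constant and a linear trend as regressors with random-walk errors are assumptions made for the example. Each term of (6) is computed as the explained sum of squares from regressing the errors on the regressors within a subsample.

```python
import numpy as np

def M_T(X, eps, k):
    """Statistic (6): sum over the two subsamples of e'X(X'X)^{-1}X'e."""
    total = 0.0
    for Xs, es in ((X[:k], eps[:k]), (X[k:], eps[k:])):
        b, *_ = np.linalg.lstsq(Xs, es, rcond=None)
        total += float(es @ (Xs @ b))  # e'Pe, with P the projection onto col(Xs)
    return total

rng = np.random.default_rng(1)
T = 200
eps = np.cumsum(rng.standard_normal(T))                 # I(1) errors (random walk)
X = np.column_stack([np.ones(T), np.arange(1, T + 1)])  # constant and trend
bound = float(eps @ eps)                                # right-hand side of (8)
values = [M_T(X, eps, k) for k in range(2, T - 1)]
print(max(values) <= bound * (1 + 1e-9))
```

The bound holds for every split point because each subsample term is a quadratic form in a projection matrix, which is exactly the argument used in the proof.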

To rule out the possibility that $\hat\lambda_T \to \{0, 1\}$, we need to further examine the
behavior of $M^*(\lambda)$ for $\lambda$ near 0 and 1. Strictly speaking, $M^*(\lambda)$ is not defined yet at
$\lambda = 0$ and $\lambda = 1$. As the limit of $M^*(\lambda)$ when $\lambda \to 0$, $M^*(0)$ should be defined as

$$
M^*(0) = G(1)'Q(1)^{-1}G(1) \tag{9}
$$

which is obtained from (5) by taking $G(\lambda) = 0$, $Q(\lambda) = 0$, and $G(\lambda)'Q(\lambda)^{-1}G(\lambda) = 0$
for $\lambda = 0$. Note that the term $G(\lambda)'Q(\lambda)^{-1}G(\lambda)$ is the limit of the first term on
the right hand side of (6) divided by $T^{a}$. Now

$$
T^{-a}\Big(\sum_{t=1}^{k}\varepsilon_t x_t\Big)'\Big(\sum_{t=1}^{k} x_t x_t'\Big)^{-1}\Big(\sum_{t=1}^{k} x_t\varepsilon_t\Big) \le T^{-a}\sum_{t=1}^{k}\varepsilon_t^2 \le T^{-2}\sum_{t=1}^{k}\varepsilon_t^2
$$

which converges to zero in probability for any given $k$, or for $k = [T\lambda]$ with $\lambda \to 0$. It
follows that $G(\lambda)'Q(\lambda)^{-1}G(\lambda) \to 0$ in probability as $\lambda \to 0$. Thus the definition in (9)
is the limit of $M^*(\lambda)$ as $\lambda \to 0$. Similarly, we can define, as the limit of $M^*(\lambda)$ when
$\lambda \to 1$, $M^*(1) = M^*(0)$. We next show that the maximum of $M^*(\lambda)$ is not attained
at 0 or 1.

Theorem 2 (i) With probability 1,

$$
M^*(0) = M^*(1) \le M^*(\lambda), \quad \text{for every } 0 < \lambda < 1. \tag{10}
$$

(ii) If $G(\lambda)$ has a nonsingular covariance function, then with probability 1

$$
M^*(0) = M^*(1) < M^*(\lambda), \quad \text{for every } 0 < \lambda < 1. \tag{11}
$$

To  prove  Theorem  2,  we  need  the  following  lemma. 

Lemma 1 For arbitrary positive-definite matrices $A$ and $B$ with $A > B$ ($p \times p$), and
arbitrary vectors $x$ and $y$ ($p \times 1$), we have

$$
x'A^{-1}x - y'B^{-1}y - (x - y)'(A - B)^{-1}(x - y) \le 0. \tag{12}
$$

Proof of Lemma 1: Define the matrix

$$
H = \begin{pmatrix} (A - B)^{-1} - A^{-1} & -(A - B)^{-1} \\ -(A - B)^{-1} & (A - B)^{-1} + B^{-1} \end{pmatrix}. \tag{13}
$$

It suffices to prove $H$ to be positive-semidefinite because the left hand side of (12) is
equal to $-z'Hz$ for $z' = (x', y')$. Let $D = (A - B)^{-1} + B^{-1} > 0$. Let $C$ be a matrix
with the first $p$ rows $(I, 0)$ and second $p$ rows $(D^{-1}(A - B)^{-1}, I)$, so that $C'$ has first
$p$ rows $(I, (A - B)^{-1}D^{-1})$ and second $p$ rows $(0, I)$. Use the identity

$$
(A - B)^{-1} - A^{-1} = (A - B)^{-1}D^{-1}(A - B)^{-1}
$$

to obtain

$$
C'HC = \mathrm{diag}(0, D) \ge 0.
$$

Thus $C'HC$ is positive-semidefinite, and so is $H$ because $C$ has full rank. This proves the
lemma. □
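Inequality (12) is easy to probe numerically, which gives a useful sanity check on the algebra. The sketch below draws random positive-definite matrices with $A - B > 0$ and records the largest value of the left hand side of (12); the helper names `lemma1_lhs` and `random_pd` and the dimensions used are assumptions for illustration, not part of the paper.

```python
import numpy as np

def lemma1_lhs(A, B, x, y):
    """Left-hand side of (12): x'A^{-1}x - y'B^{-1}y - (x-y)'(A-B)^{-1}(x-y)."""
    d = x - y
    return float(x @ np.linalg.solve(A, x)
                 - y @ np.linalg.solve(B, y)
                 - d @ np.linalg.solve(A - B, d))

def random_pd(rng, p):
    """Random symmetric positive-definite p x p matrix."""
    Z = rng.standard_normal((p, p))
    return Z @ Z.T + 0.1 * np.eye(p)

rng = np.random.default_rng(2)
p = 3
worst = -np.inf
for _ in range(1000):
    B = random_pd(rng, p)
    A = B + random_pd(rng, p)  # guarantees that A - B is positive-definite
    x, y = rng.standard_normal(p), rng.standard_normal(p)
    worst = max(worst, lemma1_lhs(A, B, x, y))
print(worst)  # the largest observed value should not exceed zero
```

Equality in (12) occurs only on a lower-dimensional set, namely when $y = D^{-1}(A - B)^{-1}x$, which is why randomly drawn vectors give strictly negative values.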


Proof of Theorem 2. The inequality $M^*(0) \le M^*(\lambda)$ is equivalent to

$$
G(1)'Q(1)^{-1}G(1) - G(\lambda)'Q(\lambda)^{-1}G(\lambda) - [G(1) - G(\lambda)]'[Q(1) - Q(\lambda)]^{-1}[G(1) - G(\lambda)] \le 0.
$$

Clearly, part (i) of Theorem 2 follows from Lemma 1 by letting $A = Q(1)$, $B = Q(\lambda)$,
$x = G(1)$, and $y = G(\lambda)$. Next, consider (ii). Let $A = Q(1)$ and $B = Q(\lambda)$ and
let $H$ be defined as in (13). Then $M^*(0) < M^*(\lambda)$ is equivalent to $-\xi'H\xi < 0$, where
$\xi = (G(1)', G(\lambda)')'$. Let $\Gamma$ be an orthogonal matrix such that $\Gamma'H\Gamma = \mathrm{diag}(\lambda_1, \ldots, \lambda_{2p})$
with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{2p}$, where the $\lambda_i$'s are the eigenvalues of $H$. Since $H \ge 0$ and $H \ne 0$,
the maximum eigenvalue of $H$ is positive. It follows that

$$
-\xi'H\xi = -(\Gamma'\xi)'\,\mathrm{diag}(\lambda_1, \ldots, \lambda_{2p})\,\Gamma'\xi \le -\eta^2\lambda_1
$$

where $\eta$ is the first component of $\Gamma'\xi$. When $G(\lambda)$ has a nonsingular covariance matrix,
so does $\xi$. Thus $\Gamma'\xi$ is a vector of normal variables with a nonsingular covariance matrix,
implying $-\eta^2\lambda_1 < 0$ with probability 1 because $P(\eta^2 = 0) = 0$. That is, $-\xi'H\xi < 0$
with probability 1. □

The above analysis applies to I(1) regressors as well, a case not considered by
NKN. Let

$$
y_t = x_t'\beta + \varepsilon_t
$$

where both $x_t$ and $\varepsilon_t$ are I(1), so that we have a spurious regression. Assume

$$
T^{-2}\sum_{t=1}^{[T\lambda]} x_t x_t' \Rightarrow Q(\lambda) \tag{14}
$$

where $Q(\lambda)$ is a stochastic matrix with $Q(\lambda) > 0$ (a.s.) and $Q(\lambda)$ is strictly increasing,
i.e. $Q(u) - Q(v) > 0$ with probability 1 for $v < u$. Also assume

$$
T^{-2}\sum_{t=1}^{[T\lambda]} x_t\varepsilon_t \Rightarrow G(\lambda) \tag{15}
$$

where $G(\cdot)$ is a vector of random processes, and for each $\lambda$, $G(\lambda)$ possesses a density
function and a nonsingular covariance matrix.

Theorem 3 The results of Theorem 1 and Theorem 2 apply to I(1) regressors
satisfying (14) and (15), so that spurious regression leads to a spurious break.


Proof of Theorem 3. First, (14) and (15) imply (3) with $D_T = T^2 I$ and $a = 2$. The
proof of (7) under the new setting is identical to the previous proof because inequality
(8) is a purely algebraic inequality and holds for arbitrary $x_t$. To prove (10), we
only need to note that if $A$, $B$, and $A - B$ are stochastic matrices that are
positive-definite with probability 1, then inequality (12) holds with probability 1. The rest of
the proof is virtually identical to that of Theorem 2. □
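The setting of Theorem 3 can be explored by a small Monte Carlo experiment. The sketch below is an illustration under assumptions, not a reproduction of the NKN experiment: the sample size, the number of replications, the trimming fractions 0.15 and 0.85, and the function names are all hypothetical choices. It regresses one random walk on a constant and an independent random walk, so that no break and no cointegration are present, and records where the least squares break fraction falls.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def break_fraction(X, y, lam_lo=0.15, lam_hi=0.85):
    """Least squares break fraction k_hat / T over the trimmed range."""
    T = len(y)
    ks = np.arange(int(lam_lo * T), int(lam_hi * T) + 1)
    S = [ssr(X[:k], y[:k]) + ssr(X[k:], y[k:]) for k in ks]
    return ks[int(np.argmin(S))] / T

rng = np.random.default_rng(3)
T, reps = 100, 200
fracs = []
for _ in range(reps):
    x = np.cumsum(rng.standard_normal(T))  # I(1) regressor
    y = np.cumsum(rng.standard_normal(T))  # independent I(1) series, no break
    X = np.column_stack([np.ones(T), x])
    fracs.append(break_fraction(X, y))
fracs = np.array(fracs)
counts, edges = np.histogram(fracs, bins=7, range=(0.15, 0.85))
print(counts)  # how the estimated break fractions spread over [0.15, 0.85]
```

Inspecting `counts` shows how the estimates are distributed across the trimmed interval; the theorems above concern the limiting process, so a finite-sample histogram of this kind is only suggestive.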

It remains an open question whether a spurious break arises when $y_t$ and $x_t$ are
cointegrated. We conjecture that a spurious break will not occur because an I(1) error
process is responsible for its occurrence.

3     Regime  Shift  in  Cointegrating  Relationships 

In the previous section we showed that two integrated processes that are not
cointegrated may give rise to a spurious break. What happens when the two processes are
cointegrated but the cointegrating relationship undergoes a shift? Can the shift point
be consistently estimated in the presence of I(1) regressors?

The issue of a structural change in cointegrating relationships is of considerable
interest. Cointegration describes a system's long-run equilibrium condition. A system
may have multiple long-run equilibria with an occasional shift from one equilibrium
to another. A structural change model allows us to describe such a system. Consider

$$
y_t = \begin{cases} \alpha_1 + \gamma_1 x_t + \varepsilon_t, & t = 1, 2, \ldots, k_0 \\ \alpha_2 + \gamma_2 x_t + \varepsilon_t, & t = k_0 + 1, \ldots, T \end{cases} \tag{16}
$$

where $x_t = x_{t-1} + e_t$ with $x_0 = 0$ and $\mathrm{Var}(e_t) > 0$. We assume $\varepsilon_t$ and $e_t$ are I(0) linear
processes such that $\varepsilon_t = \sum_{j=0}^{\infty} a_j\eta_{t-j}$ and $e_t = \sum_{j=0}^{\infty} b_j\zeta_{t-j}$, where $\eta_t$ and $\zeta_t$ are i.i.d.
sequences with finite $4 + \delta$ ($\delta > 0$) moments, and $\sum_j j|a_j| < \infty$ and $\sum_j j|b_j| < \infty$. In
addition, we assume that $k_0 = [T\tau_0]$ for some $\tau_0 \in (0, 1)$. When $\gamma_1 = \gamma_2$ but $\alpha_1 \ne \alpha_2$,
there is a shift in the mean of the long-run equilibrium. This intercept shift is often
visualized as a change in the "gap" between two cointegrated series. When $\gamma_1 \ne \gamma_2$,
there is a shift in the cointegrating relationship. Let $\beta_1 = (\alpha_1, \gamma_1)'$ and $\beta_2 = (\alpha_2, \gamma_2)'$
and $\delta = (\delta_1, \delta_2)' = (\alpha_1 - \alpha_2, \gamma_1 - \gamma_2)'$. Let $X_t = (1, x_t)'$.


It is evident that the larger the magnitude of a shift, the easier it is to identify the
break. The rate of convergence for the estimated break point depends not only on the
magnitude of the shift in the coefficients, but also on the magnitude of the regressors.
For stationary $X_t$, the rate depends on the "effective magnitude of shift", $\delta'Q\delta/\sigma_\varepsilon^2$,
where $Q = E(X_t X_t')$. In this case, it can be shown that $\hat k = k_0 + O_p(1)$. This is the
best rate that can be achieved for stationary regressors; see, e.g., Bai (1994, 1995).
Furthermore, $\hat k$ itself is not consistent for $k_0$, although in terms of the sample fraction,
$\hat k/T$ converges at a rate of $T$ to $\tau_0$. With I(1) regressors, we shall show that even
$\hat k$ becomes consistent for $k_0$. This follows because the "effective magnitude of shift"
converges to infinity. More specifically, if $\gamma_1 \ne \gamma_2$, then $a(t) = \delta'E(X_t X_t')\delta/\sigma_\varepsilon^2 \to \infty$
as $t \to \infty$. In particular, $a(k_0) \to \infty$ as $T \to \infty$. Consequently, one can estimate the
break point more precisely than with I(0) regressors. In particular, $\hat k = k_0 + O_p(1)$;
see Bai, Lumsdaine and Stock (1994) for a proof. Based on this fact, we can establish
a more interesting result. Namely, $P(\hat k = k_0) \to 1$.

Theorem 4 Let $\hat k$ denote the least squares estimator of $k_0$. Assume that $k_0 = [T\tau_0]$.
For the shifted cointegrating relationship of (16) with $\gamma_1 \ne \gamma_2$, we have

$$
P(\hat k = k_0) \to 1.
$$

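Proof of Theorem 4. Because $\hat k = k_0 + O_p(1)$, for any $\epsilon > 0$, there exists an $M < \infty$
such that $P(|\hat k - k_0| > M) < \epsilon$. Thus

$$
\begin{aligned}
P(\hat k \ne k_0) &= P(|\hat k - k_0| > M) + P(|\hat k - k_0| \le M,\ \hat k \ne k_0) \\
&\le \epsilon + P(\hat k \in D_M)
\end{aligned} \tag{17}
$$

where $D_M = \{k : |k - k_0| \le M,\ k \ne k_0\}$. By definition, $S_T(\hat k) \le \sum_{t=1}^{T}\varepsilon_t^2$. It follows
that the event $\{\hat k \in D_M\}$ implies that $\{\min_{k \in D_M} S_T(k) \le \sum_{t=1}^{T}\varepsilon_t^2\}$. Thus

$$
P(\hat k \in D_M) \le P\Big(\min_{k \in D_M} S_T(k) \le \sum_{t=1}^{T}\varepsilon_t^2\Big). \tag{18}
$$

We show that the right hand side of (18) converges to zero for every given $M$. Since $D_M$
is a finite set, it is sufficient to show that for each $k \in D_M$, $P(S_T(k) \le \sum_{t=1}^{T}\varepsilon_t^2) \to 0$.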
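We prove this for $k < k_0$. The case of $k > k_0$ is similar. Define

$$
\begin{aligned}
Y_{1,k} &= (y_1, \ldots, y_k)', & Y_{k,T} &= (y_{k+1}, \ldots, y_T)', \\
X_{1,k} &= (X_1, \ldots, X_k)', & X_{k,T} &= (X_{k+1}, \ldots, X_T)', \\
\varepsilon_{1,k} &= (\varepsilon_1, \ldots, \varepsilon_k)', & \varepsilon_{k,T} &= (\varepsilon_{k+1}, \ldots, \varepsilon_T)', \\
X_{k,k_0} &= (X_{k+1}, \ldots, X_{k_0})', & X^*_{k,T} &= (X_{k+1}, \ldots, X_{k_0}, 0, \ldots, 0)'.
\end{aligned}
$$

Furthermore, let $M_1 = I - X_{1,k}(X_{1,k}'X_{1,k})^{-1}X_{1,k}'$ and $M_2 = I - X_{k,T}(X_{k,T}'X_{k,T})^{-1}X_{k,T}'$.
Then for $k < k_0$, we have $Y_{1,k} = X_{1,k}\beta_1 + \varepsilon_{1,k}$ and $Y_{k,T} = X_{k,T}\beta_2 + X^*_{k,T}\delta + \varepsilon_{k,T}$.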

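The sum of squared residuals $S_T(k)$ is given by

$$
\begin{aligned}
S_T(k) &= Y_{1,k}'M_1Y_{1,k} + Y_{k,T}'M_2Y_{k,T} \\
&= \varepsilon_{1,k}'M_1\varepsilon_{1,k} + \varepsilon_{k,T}'M_2\varepsilon_{k,T} + 2\delta'X^{*\prime}_{k,T}M_2\varepsilon_{k,T} + \delta'X^{*\prime}_{k,T}M_2X^*_{k,T}\delta \\
&= \sum_{t=1}^{T}\varepsilon_t^2 - \varepsilon_{1,k}'X_{1,k}(X_{1,k}'X_{1,k})^{-1}X_{1,k}'\varepsilon_{1,k} - \varepsilon_{k,T}'X_{k,T}(X_{k,T}'X_{k,T})^{-1}X_{k,T}'\varepsilon_{k,T} \\
&\quad + 2\delta'\sum_{t=k+1}^{k_0} X_t\varepsilon_t - 2\delta'X_{k,k_0}'X_{k,k_0}(X_{k,T}'X_{k,T})^{-1}X_{k,T}'\varepsilon_{k,T} \\
&\quad + \delta'\Big(\sum_{t=k+1}^{k_0} X_tX_t'\Big)\delta - \delta'X_{k,k_0}'X_{k,k_0}(X_{k,T}'X_{k,T})^{-1}X_{k,k_0}'X_{k,k_0}\delta \\
&= \sum_{t=1}^{T}\varepsilon_t^2 + a_T + b_T + c_T + d_T + e_T + f_T.
\end{aligned}
$$

We have used the fact that $X^{*\prime}_{k,T}\varepsilon_{k,T} = \sum_{t=k+1}^{k_0} X_t\varepsilon_t$ and $X^{*\prime}_{k,T}X_{k,T} = X_{k,k_0}'X_{k,k_0} =
\sum_{t=k+1}^{k_0} X_tX_t'$. For each $k \in D_M$, it is easy to see that $a_T$, $b_T$, $d_T$, and $f_T$ are all $O_p(1)$.
Thus

$$
S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 = 2\delta'\sum_{t=k+1}^{k_0} X_t\varepsilon_t + \delta'\Big(\sum_{t=k+1}^{k_0} X_tX_t'\Big)\delta + O_p(1). \tag{19}
$$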

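For each $k < k_0$, the first term on the right hand side of (19) is bounded by $O_p(\sqrt{T})$,
whereas the second term is $O_p(T)$ and dominates the first term in magnitude. For
example, for $k = k_0 - 1$, the sum involves only one summand, and

$$
\begin{aligned}
S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 &= 2\delta'X_{k_0}\varepsilon_{k_0} + \delta'(X_{k_0}X_{k_0}')\delta + O_p(1) \\
&= 2\delta_1\varepsilon_{k_0} + 2\delta_2\Big(\sum_{i=1}^{k_0} e_i\Big)\varepsilon_{k_0} + \delta_1^2 + 2\delta_1\delta_2\Big(\sum_{i=1}^{k_0} e_i\Big) + \delta_2^2\Big(\sum_{i=1}^{k_0} e_i\Big)^2 + O_p(1) \\
&= O_p\Big(\delta_2\sum_{i=1}^{k_0} e_i\Big) + \Big(\delta_2\sum_{i=1}^{k_0} e_i\Big)^2.
\end{aligned} \tag{20}
$$

The last term can be rewritten as $T\delta_2^2\big(\frac{1}{\sqrt{T}}\sum_{i=1}^{k_0} e_i\big)^2$, which dominates the first term
and converges to positive infinity with probability approaching 1. This implies that
for any $\epsilon > 0$,

$$
P\Big(S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 \le 0\Big) < \epsilon
$$

for large $T$. Combining with (17) and (18), we have, for every $\epsilon > 0$, $P(\hat k \ne k_0) < 2\epsilon$
for all large $T$. □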
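Theorem 4 can be illustrated by a small simulation that compares how often the least squares estimate hits the true break exactly under a slope shift and under an intercept-only shift of the same size. This is a sketch under assumptions: i.i.d. standard normal innovations, $T = 120$, $k_0 = 60$, unit shift sizes, a trimming of five observations at each end, and the function names are all hypothetical choices rather than the paper's design.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def break_point(X, y, trim=5):
    """Least squares break point: argmin of S_T(k) over trim <= k <= T - trim."""
    T = len(y)
    ks = np.arange(trim, T - trim + 1)
    S = [ssr(X[:k], y[:k]) + ssr(X[k:], y[k:]) for k in ks]
    return int(ks[int(np.argmin(S))])

def hit_rate(d_alpha, d_gamma, T=120, k0=60, reps=200, seed=4):
    """Share of replications with k_hat == k0 under model (16)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = np.cumsum(rng.standard_normal(T))        # x_t = x_{t-1} + e_t
        eps = rng.standard_normal(T)                 # I(0) regression errors
        first = np.arange(1, T + 1) <= k0            # regime 1: t = 1, ..., k0
        alpha = np.where(first, 1.0, 1.0 - d_alpha)  # delta_1 = alpha_1 - alpha_2
        gamma = np.where(first, 1.0, 1.0 - d_gamma)  # delta_2 = gamma_1 - gamma_2
        y = alpha + gamma * x + eps
        X = np.column_stack([np.ones(T), x])
        hits += break_point(X, y) == k0
    return hits / reps

slope_shift = hit_rate(d_alpha=0.0, d_gamma=1.0)
intercept_shift = hit_rate(d_alpha=1.0, d_gamma=0.0)
print(slope_shift, intercept_shift)
```

Using the same seed for both designs means the two hit rates are computed on the same draws of $x_t$ and $\varepsilon_t$, so the comparison isolates the type of shift.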

When only the intercept has a break ($\alpha_1 \ne \alpha_2$, $\gamma_1 = \gamma_2$), $\hat k$ is no longer consistent
for $k_0$, even though $\hat k/T$ still converges to $\tau_0$ at a rate of $T$. This is because the
"effective magnitude of shift" stays bounded. The lack of consistency can also be seen
from (20). When $\delta_2 = 0$, $S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 = 2\delta_1\varepsilon_{k_0} + \delta_1^2 + O_p(1)$, which cannot be
guaranteed to be positive.

The underlying reason for $\hat k$ being consistent for $k_0$ is not the I(1) regressor per
se. Roughly speaking, if $\hat k = k_0 + O_p(1)$ holds, then for any set of regressors $X_t$ such
that the second term of (19) converges to infinity and dominates the first term, $\hat k$ will be
consistent for $k_0$. In particular, this is true for polynomial regressions. For simplicity,
we consider the following broken-trend model:

$$
y_t = \begin{cases} \alpha_1 + \gamma_1 t + \varepsilon_t, & t = 1, 2, \ldots, k_0 \\ \alpha_2 + \gamma_2 t + \varepsilon_t, & t = k_0 + 1, \ldots, T \end{cases} \tag{21}
$$

where $\varepsilon_t$ is a linear process such that $\varepsilon_t = \sum_{j=0}^{\infty} a_j\eta_{t-j}$ with $\eta_t$ a sequence of martingale
differences and $\sup_t E\eta_t^2 < \infty$, and $\sum_{j=0}^{\infty} j|a_j| < \infty$.

Theorem 5 For the broken trend model (21), assume $\gamma_1 \ne \gamma_2$. Let $\hat k$ be the least
squares estimator of $k_0$ with $k_0 = [T\tau_0]$. Then

$$
P(\hat k = k_0) \to 1, \quad \text{as } T \to \infty.
$$
Proof of Theorem 5. Let $X_t = (1, t)'$. Then the proof of Theorem 4 up to equation
(19) can be copied here. The right hand side of (19) still converges to positive infinity
(with probability 1) for the newly defined $X_t$ for each $k \in D_M$. For example, for
$k = k_0 - 1$,

$$
\begin{aligned}
S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 &= 2\delta'X_{k_0}\varepsilon_{k_0} + \delta'(X_{k_0}X_{k_0}')\delta + O_p(1) \\
&= 2\delta_1\varepsilon_{k_0} + 2\delta_2k_0\varepsilon_{k_0} + \delta_1^2 + 2\delta_1\delta_2k_0 + (\delta_2k_0)^2 + O_p(1) \\
&= O_p(\delta_2k_0) + (\delta_2k_0)^2 \\
&= O_p(\delta_2T\tau_0) + (\delta_2T)^2\tau_0^2.
\end{aligned} \tag{22}
$$

It follows that the second term above dominates the first and converges to positive
infinity. This implies that, for each $k \in D_M$, $P(S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 \le 0) \to 0$. □
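The argument for the broken trend model can be illustrated with a single simulated sample. The sketch below assumes illustrative values ($T = 200$, $k_0 = 100$, slopes 1.0 and 1.5, a common intercept of 2.0, i.i.d. standard normal errors); these numbers and the function names are hypothetical and chosen only to show the size of the dominant term $(\delta_2 k_0)^2$ in (22) relative to the noise.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def break_point(X, y, trim=5):
    """Least squares break point: argmin of S_T(k) over trim <= k <= T - trim."""
    T = len(y)
    ks = np.arange(trim, T - trim + 1)
    S = [ssr(X[:k], y[:k]) + ssr(X[k:], y[k:]) for k in ks]
    return int(ks[int(np.argmin(S))])

rng = np.random.default_rng(5)
T, k0 = 200, 100
t = np.arange(1, T + 1)
gamma = np.where(t <= k0, 1.0, 1.5)  # slope shifts after k0 (illustrative values)
y = 2.0 + gamma * t + rng.standard_normal(T)
X = np.column_stack([np.ones(T), t])
k_hat = break_point(X, y)

# Dominant term in (22) at k = k0 - 1: (delta_2 * k0)^2, with delta_2 = gamma_1 - gamma_2.
signal = ((1.0 - 1.5) * k0) ** 2
print(k_hat, signal)
```

With these values the dominant term is several orders of magnitude larger than the error variance, which is the mechanism behind exact recovery of $k_0$ in this model.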

Although Theorem 5 is not explicitly presented in the literature, it is not
unexpected. In Bai (1994, 1995), the linear trend is written as $t/T$ and it is proved that
$\hat k = k_0 + O_p(1)$ (but $\hat k$ is not consistent for $k_0$). If we rewrite the broken trend model
(21) with the linear trend expressed in the format $t/T$, then the new slope coefficients
become $T\gamma_1$ for $t \le k_0$ and $T\gamma_2$ for $t > k_0$. The magnitude of shift will be $T(\gamma_2 - \gamma_1)$.
Thus, if the model is cast in the framework of Bai (1994, 1995), one is essentially
assuming an unbounded magnitude of shift. In this sense, the consistency of $\hat k$ for $k_0$
is not surprising.

The results of Theorem 4 and Theorem 5 still hold even if we allow the magnitude
of shift to converge to zero. Let $\delta_{2,T} = \gamma_2 - \gamma_1$.

Theorem 6 For the cointegrated regression model (16), assume $\sqrt{T}\delta_{2,T} \to \infty$. For
the broken trend model (21), assume $T\delta_{2,T} \to \infty$. Under these assumptions, even if
$\delta_{2,T} \to 0$, the estimated break point $\hat k$ is still consistent for $k_0$. That is, $P(\hat k = k_0) \to 1$.

Proof of Theorem 6. Under the assumed magnitude of shift, we still have $\hat k =
k_0 + O_p(1)$; see Bai (1994, 1995) and Bai, Lumsdaine and Stock (1994).$^2$ Consider the
cointegrating regression (16). All that is needed is to prove that (19) converges to positive
infinity with probability 1. As before, we consider $k = k_0 - 1$. By (20),

$$
S_T(k) - \sum_{t=1}^{T}\varepsilon_t^2 = O_p\Big(\sqrt{T}\delta_{2,T}\frac{1}{\sqrt{T}}\sum_{i=1}^{k_0} e_i\Big) + \Big(\sqrt{T}\delta_{2,T}\frac{1}{\sqrt{T}}\sum_{i=1}^{k_0} e_i\Big)^2. \tag{23}
$$

Because $\frac{1}{\sqrt{T}}\sum_{i=1}^{k_0} e_i$ converges to a normal random variable and $\sqrt{T}\delta_{2,T} \to \infty$, the
right hand side of (23) converges to positive infinity. The proof for the broken trend
model is similar and thus omitted. □

$^2$ In fact, the assumed magnitude of shift is stronger than necessary. For example, for the
broken-trend model, $\delta_{2,T}/\sqrt{T} > c > 0$ is sufficient for $\hat k = k_0 + O_p(1)$. But the assumption is necessary for
$\hat k$ to be consistent for $k_0$.